dMPI: Facilitating Debugging of MPI Programs via Deterministic Message Passing

Authors

  • Xu Zhou
  • Kai Lu
  • Xicheng Lu
  • Xiaoping Wang
  • Baohua Fan

Abstract

This paper presents dMPI, a novel deterministic MPI implementation that facilitates the debugging of MPI programs. Unlike existing approaches, dMPI ensures inherent determinism without any external support (e.g., logs), achieving both convenience and performance. The basic idea of dMPI is to use deterministic logical time to resolve message races and control asynchronous transmissions, thereby eliminating the nondeterministic behaviors of the existing message passing mechanism. Because this scheme can itself introduce deadlocks, dMPI is integrated with a lightweight deadlock checker that dynamically detects and resolves them. We have implemented dMPI and evaluated it using the NPB benchmarks. The results show that dMPI guarantees determinism while incurring modest overhead (8% on average).
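
The idea of resolving a message race with deterministic logical time can be illustrated with a small simulation. This is only a sketch under stated assumptions, not dMPI's actual algorithm: the abstract does not say how logical time is computed or compared, so the `Message` type, the `logical_time` field, the tie-break on sender rank, and both receive functions are hypothetical. It also assumes every competing message is already pending; a real implementation would have to wait until no earlier-stamped message could still arrive.

```python
import itertools
from typing import NamedTuple


class Message(NamedTuple):
    # All fields are illustrative assumptions, not taken from the paper.
    logical_time: int  # sender's deterministic logical clock at send time
    sender: int        # rank of the sending process
    payload: str


def arrival_order_receive(pending):
    """Model of a conventional wildcard receive: the first arrival wins."""
    return pending[0]


def deterministic_receive(pending):
    """Pick the message with the smallest (logical_time, sender) pair,
    so the choice does not depend on physical arrival order."""
    return min(pending, key=lambda m: (m.logical_time, m.sender))


messages = [
    Message(logical_time=5, sender=2, payload="from rank 2"),
    Message(logical_time=3, sender=1, payload="from rank 1"),
    Message(logical_time=3, sender=0, payload="from rank 0"),
]

# Try every possible physical arrival order of the three racing messages.
arrival_winners = {
    arrival_order_receive(list(order)).sender
    for order in itertools.permutations(messages)
}
deterministic_winners = {
    deterministic_receive(list(order)).sender
    for order in itertools.permutations(messages)
}

print("arrival-order winners:", sorted(arrival_winners))
print("logical-time winners: ", sorted(deterministic_winners))
```

Under arrival order, any of the three senders can win the race depending on timing, whereas ordering by logical time selects the same sender in every run.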

Similar resources

An Implementation of Race Detection and Deterministic Replay with MPI

The Parallel Debugging Tool (PDT) of the Annai programming environment is developed within the Joint CSCS-ETH/NEC Collaboration in Parallel Processing. Similarly to the other components of the integrated environment, PDT aims to provide support for application developers to debug portable large-scale data-parallel programs based on HPF, and message-passing programs based on the MPI standard. Fo...

Two automated techniques for analyzing and debugging Mpi-based programs

Message Passing Interface (MPI) is the most commonly used paradigm in writing parallel programs since it can be employed not only within a single processing node but also across several connected ones. Data flow analysis concepts, techniques and tools are needed to understand and analyze MPI-based programs to detect bugs that arise in these programs. In this paper we propose two automated techniques...

MARMOT: An MPI Analysis and Checking Tool

The Message Passing Interface (MPI) is widely used to write parallel programs using message passing. MARMOT is a tool to aid in the development and debugging of MPI programs. This paper presents the situations where incorrect usage of MPI by the application programmer is automatically detected. Examples are the introduction of irreproducibility, deadlocks and incorrect management of resources l...

Debugging Tool for Localizing Faulty Processes in Message Passing Programs

In message passing programs, once a process terminates with an unexpected error, the terminated process can propagate the error to the rest of processes through communication dependencies, resulting in a program failure. Therefore, to locate faults, developers must identify the group of processes involved in the original error and faulty processes that activate faults. This paper presents a nov...

Debugging Message Passing Programs Using Invisible Message Tags

Source level debuggers for parallel PVM or MPI programs currently offer good support for debugging multiple processes, however, they still lack adequate mechanisms for debugging message passing errors. In this paper, we present a new concept called message breakpoints, which allows to follow the information flow between processes. We also show how these breakpoints can be implemented very efficientl...

Publication date: 2012